Introduction
From 2012 to 2017, there were 3,687 railroad trespasser fatalities across the United States (Kidda et al. 2020). Previous studies have assessed trends among trespasser strikes and emphasized that these strikes are an urban rather than a rural problem. I will assess the relationship between population density and trespasser strikes using spatial data science techniques.
Because California has the most trespasser fatalities of any U.S. state, I will limit my analysis to California (Kidda et al. 2020). I will use trespasser strike data from the Department of Transportation, which records the latitude and longitude of each strike as point data.
By assessing the relationship between trespasser fatalities in California and population density, I hope to apply my findings to targeted interventions to prevent future trespasser strikes.
Literature Review
The Federal Railroad Administration (FRA) assessed trends in trespasser train strikes from 2012 to 2017. California, New York, Florida, and Texas had the most trespasser strikes among U.S. states (Kidda et al. 2020). The FRA identified trends in suicides, train types, time of day, age, and the individual’s action at the time of death (Kidda et al. 2020). That study does not assess the relationship between population density and trespasser strikes.
Northwestern economics professor Ian Savage notes in his 2007 manuscript on railroad trespassing that trespasser strikes appear to be an urban rather than a rural problem, as “less than one quarter of fatalities occur outside of town or city limits” (Savage 2007).
Methodology
For population density, I retrieved 2020 California census tract data with tidycensus and divided each tract's total population by its total area. For the hypothesis testing, I computed the centroid of each census tract and used these centroids as my underlying density.
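The density and centroid calculations can be illustrated with a minimal sketch. It is written in Python with made-up numbers and is not the tidycensus workflow used for the analysis. The function names, the square-metre area unit, and the assumption that tract vertices are in a planar (projected) coordinate system are all hypothetical choices made for illustration.

```python
def population_density(population, land_area_m2):
    """People per square kilometre, or None when the tract has no land area.

    Assumes the area is given in square metres (an assumption of this sketch).
    """
    if land_area_m2 <= 0:
        return None
    return population / (land_area_m2 / 1_000_000)


def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula).

    `vertices` is a list of (x, y) pairs in planar coordinates, listed in
    order around the boundary without repeating the first vertex.
    """
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # The centroid formula divides by 6 * area, which is 3 * twice_area.
    return cx / (3 * twice_area), cy / (3 * twice_area)


# Hypothetical tract: 4,000 residents on 2 square kilometres of land.
density = population_density(4000, 2_000_000)
# Hypothetical square tract with corners at (0, 0) and (2, 2).
centroid = polygon_centroid([(0, 0), (2, 0), (2, 2), (0, 2)])
```

Tracts with zero land area return None here so that they can be excluded explicitly instead of causing a division error.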
Exploratory Data Analysis (EDA)
The map below shows the 1,528 trespasser fatalities (in purple) in California from 2011 to 2022.
Looking only at census tracts where fatalities have occurred, we can see hotspots for strikes in Northern California, specifically around the Richmond and Berkeley area as well as around the Davis and Modesto areas. There appear to be fewer strikes in Southern California, although there appears to be a cluster around the Pomona and Ontario area.
Global Moran’s I
Using Global Moran’s I for spatial autocorrelation, we can determine whether the data are clustered, and we can use Local Moran’s I to identify where these clusters lie. I used all of the census tracts across California, even those without strikes, in order to assess clustering.
The result of the Global Moran’s I test is displayed below.
[1] 0.15995
A value of roughly 0.16 indicates slight positive spatial autocorrelation, so we can conclude that nearby census tracts have slightly similar numbers of strikes.
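To make the statistic concrete, the following is a minimal sketch of how Global Moran's I is computed from a vector of tract-level strike counts and a spatial weights matrix. It is plain Python on toy data, not the code that produced the value above. The function name and the binary contiguity weights (1 for neighboring tracts, 0 otherwise) are assumptions, since the weighting scheme used in the analysis is not stated here.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weights matrix.

    weights[i][j] is the weight between areas i and j (0 on the diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    variance_sum = sum(d * d for d in dev)
    return (n / s0) * cross / variance_sum


# Four hypothetical tracts in a row; each is a neighbor of the next one.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([0, 0, 3, 3], line_weights)    # similar counts adjacent
alternating = morans_i([0, 3, 0, 3], line_weights)  # dissimilar counts adjacent
```

In this toy example the clustered arrangement gives a positive value (about 0.33) and the alternating arrangement gives a negative value (about -1), which is the sense in which a value near 0.16 reflects weak positive clustering.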
Local Moran’s I
The Local Moran’s I test identifies the areas where strikes are clustered together. The results of the test are shown in the map below.
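As an illustration of what the local statistic measures, the sketch below computes a Local Moran's I value for each area and labels its cluster quadrant. It is plain Python on toy data and is not the code behind the map. The function names, the binary contiguity weights, and the quadrant labels are assumptions for illustration, and the sketch omits the significance testing that normally accompanies a cluster map.

```python
def local_morans_i(values, weights):
    """Return a list of (local_i, quadrant) pairs, one per area.

    local_i = z_i * sum_j(w_ij * z_j) / m2, where z is the deviation from
    the mean and m2 is the mean squared deviation.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n
    results = []
    for i in range(n):
        # Spatial lag: weighted sum of the neighbors' deviations.
        lag = sum(weights[i][j] * dev[j] for j in range(n))
        local_i = dev[i] * lag / m2
        if dev[i] > 0 and lag > 0:
            quadrant = "high-high"
        elif dev[i] < 0 and lag < 0:
            quadrant = "low-low"
        elif dev[i] > 0 and lag < 0:
            quadrant = "high-low"
        elif dev[i] < 0 and lag > 0:
            quadrant = "low-high"
        else:
            quadrant = "undefined"
        results.append((local_i, quadrant))
    return results


# Four hypothetical tracts in a row; each is a neighbor of the next one.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

local = local_morans_i([0, 0, 3, 3], line_weights)
```

With these definitions the local values sum to the global statistic multiplied by the sum of the weights, so each local value can be read as one area's contribution to the overall clustering.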